Introduction

This research investigated ways in which computers can aid the decision making of
a human operator of an aerospace system. The approach taken is to aid rather than
replace the human operator, because operational experience has shown that humans can
enhance the effectiveness of systems. As systems become more automated, the role of the 
operator has shifted to that of a manager and problem solver. This shift has created the 
research area of how to aid the human in this role. 


The remainder of this report describes published research in four areas. It 
concludes with a discussion of the DC-8 flight simulator at Georgia Tech. 
problems in implementing automation. Proceedings of the 1985 International
Conference on Systems, Man, and Cybernetics.
Model-Based Online Aiding [5]

This research addressed the feasibility of adapting an existing rule-based system as 
an online "coach" for controlling PLANT, a simulation of a generic process plant. KARL, 
a rule-based model capable of controlling PLANT, was adapted to provide three types of 
information to subjects: 

1) situation assessment (i.e., which operational procedure, if any, was applicable for a 
given situation); 

2) guidance in following procedures (i.e., feedback whenever subjects’ actions were 
inconsistent with available procedures); and 

3) performance feedback (based upon changes in the system’s stability). 

Subjects received this information online while controlling PLANT. Compared to subjects 
in an earlier experiment who controlled PLANT without the benefit of the coach, these 
subjects maintained a generally more stable system, scored higher on a paper-and-pencil 
test of system knowledge, and were more successful in diagnosing an unfamiliar failure of 
the PLANT safety system. Careful analysis of these results in light of previous research 
with PLANT indicated that the reasons for these differences were not as straightforward as 
they might appear. This experiment is viewed as illustrating potential benefits and 
subtleties of using a rule-based model as an online coach. 
Significance Testing of Rule-Based Models [1]


Many researchers have used rule-based systems to model human problem solving. 
Typically, the rule-based system has a large number of rules, each of which has several free 
variables that were adjusted during the modeling process. For the most part, significance 
testing of these rules has not been much of a consideration, although it should be. It is 
possible to describe N data perfectly with N rules using a trivial model that simply 
reproduces the data. While there is no evidence that this has happened in any of the 
research reported to date, there is a certain danger of overfitting a rule-based model. 

Three methods were developed for testing the statistical significance of rules and 
other components of rule-based models. It was assumed that the percentage of behavior 
matched (e.g., commands) was the performance measure of interest. Two of the testing 
approaches, however, were not limited to this measure. They may be used to study any 
performance measure, though it may be possible for a rule to produce a statistically 
significant effect on one performance measure but not another. Rule testing by analysis of 
variance, randomization, and contingency tables was studied, and comparisons between 
these methods were developed. 
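
As an illustration of the randomization approach, the following is a minimal sketch, not the procedure used in the cited work. It assumes (hypothetically) that each subject action has been scored 1 or 0 according to whether the model reproduced it, once with the rule under test and once without it. The function name, its arguments, and the sign-flipping scheme are all assumptions made for this example.

```python
import random

def randomization_test(matched_with, matched_without, n_resamples=10000, seed=0):
    """Paired randomization test on the gain in matches from one rule.

    matched_with[i] / matched_without[i] are 1 if the model (with / without
    the rule under test) reproduced the subject's i-th action, else 0.
    Returns (observed gain in fraction matched, one-sided p-value).
    """
    if len(matched_with) != len(matched_without):
        raise ValueError("both sequences must cover the same actions")
    diffs = [w - wo for w, wo in zip(matched_with, matched_without)]
    n = len(diffs)
    observed = sum(diffs) / n
    rng = random.Random(seed)
    at_least_as_large = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the rule is irrelevant, so the labels
        # "with" and "without" are exchangeable for every action.
        resampled = sum(d if rng.random() < 0.5 else -d for d in diffs) / n
        if resampled >= observed:
            at_least_as_large += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (at_least_as_large + 1) / (n_resamples + 1)
    return observed, p_value
```

A small p-value suggests that the rule's contribution to the percentage of behavior matched is unlikely to be an accident of fitting. The same skeleton applies to any other per-action performance measure.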

Identification of Rule-Based Models of Problem Solving [6, 7]

Rule-based models have frequently been used to model human performance and 
behavior. A machine learning program was used to identify the rules employed by humans 
in two settings. The first setting was a collision avoidance maneuver for which the pilots 
had a cockpit display of traffic information (CDTI). This data was generated from an 
experiment to evaluate the effects of various CDTI displays on avoidance behavior. 
The rules produced by the machine learning program can be combined in a decision
sequence that accounts for a substantial portion of the maneuvers. When the intruder was 
maintaining a constant altitude, pilots executed vertical away maneuvers even for intruders 
posing no threat. This is the easiest of the maneuver decisions because it entails no 
geometric complications and was used whenever possible. For intruders changing altitude, 
a minority of pilots consistently checked for a threatening separation and remained on 
course if none existed. Another subgroup responded to horizontal threats by uniformly 
turning toward the intruder. This is a good decision if the intruder would have passed in 
front but aggravates the situation for intruders which would pass behind. The remainder of 
the pilots included this qualification in their decisions to turn toward the intruder. The 
mirror of this strategy, turning away from intruders which would pass behind, was not
observed.


The second setting was PLANT [Morris, N.M., and Rouse, W.B. (1985). "The
effects of type of knowledge upon human problem solving in a process control task."
IEEE Transactions on Systems, Man, and Cybernetics, SMC-15(6)], a simulated industrial
process in which feedstock is pumped in at one end and the finished product is pumped out 
at the other. A three-by-three matrix of tanks connects PLANT input to output. Each tank 
is connected by valves to all tanks in adjacent columns. The operator controls valve 
positions and pumping rates for feedstock and product. Fluid dynamics are modeled
within the system, causing lags and oscillations when valves change state, as well as
varying rates of flow due to relative tank heights. Valves trip closed when flow exceeds 
their setpoints. Failures of pumps and valves are also possible. The CRT system display 
shows tanks, their levels of fluid, open valves connecting the tanks, and numerical labels 
showing pumping rates and tank levels. 
In concert these features produce a complex symbolic task in which conflicting goals
relating production, system stability, long term trends, failures, and trips must be balanced 
to operate the system. At peak efficiency, all valves should be open, tank levels uniform 
across the system, and identically high pumping rates set for feedstock and product. 
PLANT is operated by subjects through a series of iterations in which a control action is
entered and the resultant updated system state displayed. The iterations from an
experimental session (approximately 500) provide a series of "snapshots" isolating specific system states
and the responses subjects made to them. 

In an initial analysis of this data [8], small sets of high coverage rules were 
assembled. Cross-validation was used to assess the reliability of the selected rules. 
Identified rules correctly matched 51% of control decisions in the identification sample for 
subjects in the control group and 32% of the control decisions in the validation sample. 
For subjects using PLANT procedures, combining symbolic (rule-based) and signal 
(internal dynamic model of PLANT) processing fared better, matching control decisions
52% of the time. The generality of the well-performing rules obtained precluded the
detailed analysis of strategy possible in the CDTI case. 

Deep Reasoning Fault Diagnosis [2, 3, 4, 9, 11] 

This research studied the design and evaluation of knowledge-based aiding for a 
human operator who must diagnose a novel fault in a dynamic, physical system. Since the 
operator must employ deep reasoning about system behavior to diagnose such a fault, his 
or her performance may be restricted by cognitive limitations and biases. A computer aid 
based on a qualitative model of the system was built to help the operator overcome some 
of these limitations. This aid differs from most expert systems in that it operates at several 
levels of interaction which are believed to be more suitable for deep reasoning. 
Four aiding approaches, each of which provided unique information to the
operator, were evaluated. The aiding features were designed to help the human's causal
reasoning about the system in predicting normal system behavior (N aiding), integrating
observations into actual system behavior (O aiding), finding discrepancies between the two
(O-N aiding), or finding discrepancies between observed behavior and hypothetical 
behavior (O-H aiding). Three experiments were conducted to evaluate the aiding 
approaches and to investigate the nature of deep-reasoning diagnosis. Human diagnostic 
performance improved by almost a factor of two with O aiding and O-N aiding. The 
results from the experiments were integrated into a model of human information 
processing in causal reasoning diagnosis. 

DC-8 Flight Simulator 

The failure to both complete and utilize the DC-8 flight simulator is a 
disappointment. An assessment of the cost of developing the simulation should have been 
prepared initially. The development breaks down into three categories: hardware, flight
simulation, and display generation. The hardware category was completed at a cost of 
roughly $75,000. The flight simulation code is roughly one half done, and perhaps another 
10,000 lines of code need to be written and tested. This would require one programmer- 
year to produce ($50,000). Display generation would require $15,000 in hardware and 
another programmer-year ($50,000). A total estimated cost of $190,000 compares 
favorably with the cost of a commercial product. However, the research funding needed to 
support such a facility must be larger than a single $100,000/year grant.
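
The estimate above can be checked with a few lines of arithmetic. The dollar amounts are the ones quoted in the text; the dictionary labels and the "remaining" figure (the total less the hardware already paid for) are derived here for illustration and do not appear in the original assessment.

```python
# Cost figures quoted in the text above, in US dollars.
costs = {
    "hardware (completed)": 75_000,
    "flight simulation code (one programmer-year)": 50_000,
    "display hardware": 15_000,
    "display software (one programmer-year)": 50_000,
}

total = sum(costs.values())
# Derived figure (not in the source): what is left once the completed
# hardware purchase is set aside.
remaining = total - costs["hardware (completed)"]

print(f"total estimate: ${total:,}")
print(f"still to be spent: ${remaining:,}")
```

Under these figures the outstanding work alone exceeds one year of a $100,000 grant, which is consistent with the conclusion drawn above.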


DEEP-REASONING FAULT DIAGNOSIS: AN AID AND A MODEL 


Wan C. Yoon and John M. Hammer

Center for Man-Machine Systems Research
Georgia Institute of Technology
Atlanta, Georgia 30332


ABSTRACT 

The design and evaluation are presented for knowledge-based aiding for a
human operator who must diagnose a novel fault in a dynamic, physical system.
Since the operator must employ deep reasoning about system behavior to diagnose
such a fault, the performance may be restricted by cognitive limitations
and biases. A computer aid based on a qualitative model of the system was
built to help the operator overcome some of his/her cognitive limitations.
This aid differs from most expert systems in that it operates at several levels
of interaction which are believed to be more suitable for deep reasoning.

Four aiding approaches, each of which provided unique information to the
operator, were evaluated. The aiding features were designed to help the
human's causal reasoning about the system in predicting normal system behavior
(N aiding), integrating observations into actual system behavior (O aiding),
finding discrepancies between the two (O-N aiding), or finding discrepancies
between observed behavior and hypothetical behavior (O-H aiding). Three
experiments were conducted to evaluate the aiding approaches and to investigate
the nature of deep-reasoning diagnosis. Human diagnostic performance
improved by almost a factor of two with O aiding and O-N aiding. The results
from the experiments were integrated into a model of human information
processing in causal reasoning diagnosis.


INTRODUCTION 

Becoming more of a monitor and supervisor in today's highly automated 
systems [Rasmussen 1984], the human operator must at times be involved in 
the task of diagnosing system failures, which is increasingly difficult as 
the system becomes more complicated and automated. The prevalent approach 
to fault diagnosis is to train the operator to have better knowledge and 
experience with commonly expected faults. The training might teach the 
operator to use symptoms to distinguish faults and to follow procedures to 
correct them. While this approach should be successful with common faults, 
it does not support diagnosis of novel faults. 

Another, more recent approach is to support the human operator via
expert systems for diagnosis. Those expert systems are typically based on a
large collection of diagnostic rules, which associate symptoms with causes and
generate tests. As for novel failures, many expert systems for diagnosis
[Shortliffe 1976, Miller, Pople, and Myers 1984] are based on shallow
reasoning: a set of symptoms suggests a diagnosis. This mapping is based on
experience rather than a system model. Consequently, such systems are subject
to the same limitations as training and procedures. The expert system
designer has to anticipate the failure for the expert system to solve it
correctly.
Aiding Based on a System Model


To diagnose an unanticipated, unexperienced fault, the operator must 
rely on his/her understanding of causality of the system [Davis 1984]. Such 
causal reasoning is usually a very demanding cognitive task when the system 
is complex. Therefore, an intelligent aid should be able to support the 
operator in causal reasoning about the system behavior. The most obvious way 
to achieve this is to let the aid run its own causal model of the system and 
provide the results to the human. A qualitative model of the system can be 
useful for this purpose. 

Another advantage of an aid based on a causal model is that it should
be more reliable and robust. The system knowledge is represented at the
component level. Because components are small and comprehensible, it should be
possible to create representations that are correct, perhaps even provably
so. A system fault can be expressed as a combination of component faults
which does not require a priori identification of the system fault itself.
Thus, an aid based on a causal system model can cover a wider range of
faults.

In spite of the power of the intelligent aid, we believe there are
several reasons to keep the human in command of the problem solving. First,
the current trend of automatic diagnosis is based on large rule-bases which
are less useful in novel fault diagnosis. Second, the human and the aid may
be better able to find a solution cooperatively than either can alone. This
is possible, even necessary, because the human has better pattern recognition
capabilities and can make inductive leaps. Third, in many cases, diagnosis
is one of the subgoals and may interfere with other subgoals. For
example, when diagnosis involves operating the system (e.g., opening valves,
starting motors), it may interfere with the subgoal of system safety. The
human is better suited for the responsibility of resolving tradeoffs in pursuit
of an overall goal. Lastly, the human may need to resolve ambiguities
inherent in the aid's model or even to extend the model.

Suboptimalities in Human Problem Solving

The aid is designed to mitigate human suboptimalities that occur during 
decision-making and troubleshooting [Wickens 1984]. Two categories of 
suboptimalities used here are knowledge-limited and cognition-limited. The 
knowledge-limited suboptimality is simply that the operator does not fully 
understand the system. Obviously, the aid's model is a basis for compensating
for this problem.

Cognition-limited suboptimalities are of more interest when the system 
fault is novel rather than common. Novel fault diagnosis requires causal 
reasoning about the system, which is a cognitively very demanding task. The 
operator should repeatedly run a mental model of the system in multiple
modes as well as maintain a diagnostic procedure. The required information 
processing can overload the operator's limited mental resources, especially 
attention and working memory. The results may be incorrect reasoning or 
inefficient use of information. 

To help, the computer aid can process and display useful information so 
that the operator can use it. This may improve the system performance in 
two ways. First, the operator can dynamically allocate some subtasks to the 
aid and concentrate on others. This leads to lessened mental workload and 
improved performance on those subtasks undertaken by the operator. Second, 
since the aid reasons in parallel with the human, the human can confirm 
his/her results against the aid's results. When the human overlooks some
useful information or is affected by some biases, discrepancies would be
noticed between the aid's results and the operator's own. The operator may
then adopt the aid's result to be used in subsequent reasoning. For example,
when the human and the aid evaluate a hypothesis, the confirmation bias
(i.e., the tendency to seek only confirming evidence) will be prevented
since the aid, not being susceptible to this bias, would report disconfirming
evidence.

Research Questions 

It is likely that not every plausible form of aiding will improve 
operator performance. Some information which is both relevant and helpful 
may not be able to improve human performance because the human fails to 
incorporate the information into his/her problem solving. This leads to
another question: which types of information are easily usable by the human? 
Our approach to answering these questions was, first, to build an aid based 
on the best principles available to us, and let the aid supply prospective 
types of information in experimental settings to evaluate their actual aiding
effects. Successful and unsuccessful aiding may also provide insight on
the architecture of human information processing. 

In the subsequent sections of this article, we will discuss the suitable
form of interaction for a deep-reasoning aid, the system which served
as the context of the problem, qualitative modeling of the system, the features
of the aid, the experiments and results, and a model of human information
processing in causal reasoning diagnosis. Because a literature review was
included in a recently published early report of this research [Yoon and
Hammer 1987], no review appears here.
LEVELS OF INTERACTION


In the design of interaction between the aid and the human, it is
important to consider the nature of the task to be aided. Deep-reasoning
diagnosis has many subprocesses of which even the problem solver may not be
aware. The aid should be able to help the human's processing without
disturbing or interfering with it.

To discuss appropriate forms of interaction in this situation, we stratify
the ways in which the human and computer interact into five levels in
terms of intrusiveness (Figure 1). The two extreme (i.e., the most 
intrusive) levels are the human-direct level and the computer-direct level. 
In the middle, the human-suggest and the computer-suggest levels allow a 
problem solver, the human or the aid, to be moderately intrusive. Finally, 
there is the independent level at which neither problem solver influences 
the other. This stratification is orthogonal to the levels of required 
intelligence or knowledge the aid should have [Greenstein 1980]. 
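
The five-level stratification can be written down as a small data structure. This is only a sketch: the numeric intrusiveness scores (2 for directing, 1 for suggesting, 0 for no influence) and all identifiers are assumptions introduced here, not part of the original design.

```python
from enum import Enum

class InteractionLevel(Enum):
    # value = (who initiates, intrusiveness: 2 = directs, 1 = suggests, 0 = none)
    # The numeric scores are an illustrative assumption.
    HUMAN_DIRECT = ("human", 2)
    HUMAN_SUGGEST = ("human", 1)
    INDEPENDENT = (None, 0)
    COMPUTER_SUGGEST = ("computer", 1)
    COMPUTER_DIRECT = ("computer", 2)

    @property
    def initiator(self):
        return self.value[0]

    @property
    def intrusiveness(self):
        return self.value[1]

def middle_levels():
    """Levels below the two extremes, listed in declaration order."""
    return [lvl for lvl in InteractionLevel if lvl.intrusiveness < 2]
```

Calling `middle_levels()` returns the two suggest levels and the independent level, the three levels that lie between the two most intrusive ones.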

At the human-direct level, the human assigns tasks to the computer.
For example, the computer will respond to the operator's request to perform
a subtask or to answer a question. The situation is opposite at the
computer-direct level: the computer asks the human for some information or
to perform some tasks. The human does not have a choice other than to follow
the request.

Typical expert systems use only these two levels of interaction; some 
systems use only one of the two, others use both. At either level, the 
overall processing is serial and requires explicit communication. Certainly, 
this property does not promote the human's deep reasoning. The difficulty 
of human-direct level interaction is that the effectiveness of the aid 
depends upon the ability of the human to decompose the overall task into
modular subtasks [Wickens 1984]. On the other hand, at the computer-direct
level, the human does not have the freedom to pursue his/her own processing.
This would reduce the benefit to the system of having the human, whose
flexibility and inductive and pattern recognition capabilities are superior to
those of automation.

At the human-suggest level of interaction, the human may impose
constraints on the computer's processing. Examples are adjusting weights of
different criteria, modifying the computer's intermediate results, or
restricting the computer's attention to some area in the problem space.
However, the computer will continue its tasks without explicit assignment by
the human; only the data or criteria are modified. The computer-suggest
level allows the computer to provide some information or warning to the
human. The human is free to attend or not depending on his/her assessment of
the situation. The operator may postpone a response until finishing a current
line of reasoning; or, the computer can be completely ignored. Thus, the
communication is allowed to be less explicit and more abstract. What
becomes a critical issue is that the suggestions by the computer need to be
compatible with the human's reasoning process.

At the independent level, both problem solvers pursue their own problem
solving procedures without influencing each other. This level is almost
non-existent in conventional expert systems which employ only the two
extreme levels. When the interaction occurs at the suggest levels, however,
the independent level fills the intermissions between suggestions. While
there is no interaction, both problem solvers may be highly active in their
problem solving. At times, the deep-reasoning process needs to be supported
by interruption-free independence.

We believe that the three middle levels should facilitate more adequate
aiding of deep-reasoning tasks. At those levels, the processing is more
parallel and both problem solvers have more freedom. Two human problem
solvers would interact mostly at those levels; they would suggest, take
comments and hints, or be silent. Using the three levels of interaction was
one of our principles in building the aid for novel fault diagnosis.
Another related principle was to consider compatibility of aiding information
with human information processing.

THE SYSTEM AND THE TASK 

The Orbital Refueling System (ORS), a NASA-designed payload on the
Space Shuttle, was selected for study [NASA 1985]. The function of the ORS 
is to refuel orbiting satellites with hydrazine, with the objective of 
extending their useful service life. As shown in Figure 2, the ORS fluid 
system contains a variety of components such as tanks, valves, pipes, etc. 
The operator controls the simulated ORS by opening and closing valves. 
Transferring fuel from propellant tank 1 to propellant tank 2 might proceed 
as follows. First, tank 2 pressure is reduced by momentarily opening valves 
10, 11, 13, and 17. Second, tank 1 is pressurized by opening valves 1, 3,
and 7. Gaseous nitrogen will flow out of the two small supply tanks, be 
pressure regulated, and fill tank 1 on one side of the bladder. To transfer 
fuel to tank 2, valves 5, 14, 15, 16, and 9 would be opened. Because this
version of the ORS was for demonstration purposes, all transfers take place 
between the two large tanks rather than to a satellite fuel tank. There are 
several assemblies whose purpose was not explained in the above example. 
The relief valves RV1 and RV2 serve as a safety pressure relief. Check
valve CV1 prevents backflow into the gas system. The bladders in tanks 1 and
2 serve to isolate the fuel from the propellant and also to contain the fuel
in the weightlessness of space. Some components (e.g., valves 10 and 11)
may seem redundant; they are so by design for two-failure tolerance.
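
The tank 1 to tank 2 transfer described above can be captured as data. The sketch below uses the valve numbers quoted in the text; the `Step` type, the `commands` helper, the order of valves within a step, and the reading of "momentarily" as "open all, then close all" are assumptions made for illustration only.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    purpose: str
    valves: tuple
    momentary: bool = False  # True: open the valves, then close them again

# Valve numbers are those quoted in the text for a tank 1 -> tank 2 transfer.
TRANSFER_1_TO_2 = (
    Step("reduce tank 2 pressure", (10, 11, 13, 17), momentary=True),
    Step("pressurize tank 1", (1, 3, 7)),
    Step("transfer fuel to tank 2", (5, 14, 15, 16, 9)),
)

def commands(procedure):
    """Expand a procedure into a flat list of (action, valve) commands."""
    out = []
    for step in procedure:
        out.extend(("open", v) for v in step.valves)
        if step.momentary:
            out.extend(("close", v) for v in step.valves)
    return out
```

Expanding the procedure this way yields the stream of open and close commands that an operator would enter one at a time.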

Nomenclature 

In discussing the ORS and the operator's actions and diagnosis, we have 
found the following nomenclature useful. A component is the smallest unit of
the ORS system that is modeled in isolation. Typical components include
valves, tanks, pipes, regulators, sensors, etc. The entire set of components,
working together according to the qualitative dynamics, is a system.
A path is a connected set of components, which could be either a
graph-theoretic path or tree.

Components have states. For example, a valve may be open, closed, or
leaking. The state is what the component is actually doing. A commanded
state is the state which a commandable component is asked to assume. For
example, a valve may be commanded open or closed. A component also has a
behavior mode, such as fail-open or normal. The behavior mode describes the
states which the component takes in response to commands and external
conditions. For example, a fail-open valve is always open, regardless of the
command.
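
The distinction between state, commanded state, and behavior mode can be sketched as follows. The class and attribute names are hypothetical, and the "fail-closed" mode is an added assumption (the text names only fail-open and normal).

```python
from dataclasses import dataclass

@dataclass
class Valve:
    name: str
    behavior_mode: str = "normal"   # "normal", "fail-open", or (assumed) "fail-closed"
    commanded_state: str = "closed"

    def command(self, state):
        if state not in ("open", "closed"):
            raise ValueError("a valve can only be commanded open or closed")
        self.commanded_state = state

    @property
    def state(self):
        """Actual state, determined by the behavior mode and the command."""
        if self.behavior_mode == "fail-open":
            return "open"
        if self.behavior_mode == "fail-closed":
            return "closed"
        return self.commanded_state
```

For example, after `v = Valve("V8", behavior_mode="fail-open")` and `v.command("closed")`, the commanded state is closed while `v.state` remains open, which is exactly the discrepancy a diagnostician must detect.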

The Diagnosis Task 

The operator's task is to diagnose the failure in the system. This
requires the operator to manipulate and observe the system, because a diagnosis
cannot be determined uniquely from an observation of a state vector at
a single point in time. A solution is an assignment of states to components
such that the assignment's behavior is always identical to system behavior.
For a single valve failure, the solution would be a normal state for all
components save the failed valve, which might be jammed shut. The diagnosis
problem can be viewed as a combinatorial search for a state assignment. The
search is constrained by the laws of component physics. That is, a state
assignment to a component imposes constraints on its neighboring components.
For example, if a valve is opened and permits a flow down a pipe, the component
receiving the flow must be in a state to accept the flow.
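
To make the search view concrete, the sketch below enumerates assignments for a deliberately tiny, hypothetical system: a single line in which fluid flows only if every valve is actually open. The three behavior modes, the toy physics, and all names are assumptions for illustration; the ORS model itself is far richer.

```python
from itertools import product

MODES = ("normal", "fail-open", "fail-closed")

def actual_state(mode, commanded):
    """State a valve really takes, given its behavior mode and its command."""
    if mode == "fail-open":
        return "open"
    if mode == "fail-closed":
        return "closed"
    return commanded

def predicts_flow(modes, commands):
    """Toy physics: fluid flows only if every valve in the line is open."""
    return all(actual_state(modes[v], commands[v]) == "open" for v in commands)

def consistent_assignments(valves, observations):
    """Return every mode assignment that reproduces all observations.

    observations: list of (commands dict, flow observed as bool) pairs.
    """
    solutions = []
    for combo in product(MODES, repeat=len(valves)):
        modes = dict(zip(valves, combo))
        if all(predicts_flow(modes, cmds) == flow for cmds, flow in observations):
            solutions.append(modes)
    return solutions
```

With valves A and B, observing flow while A is commanded open and B closed leaves two candidate assignments. Adding a second observation (no flow with both commanded closed) narrows the set to a single one, A normal and B fail-open, which illustrates why one snapshot of the state vector is not enough.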

QUALITATIVE MODELS OF CONTINUOUS PHYSICAL PROCESSES 

This section describes qualitative models: representations, the computational
problems solved, and the specific requirements our aid places on the
qualitative model.

A qualitative model is a symbolic representation of a system. Its most 
basic description is of a component. A component is described in terms of 
its connections to other components and its behavior. Behavior is described 
in terms of the physical variables which are present at its connections. 
The differentiation between the structural description (connections) and the 
behavioral description is particularly important for ensuring the robustness
of a qualitative model. The isolation of each component in the behavioral 
description has usually been emphasized by other qualitative modeling [De 
Kleer and Brown 1983]. In contrast, our qualitative model represents the
system at both the component level and at an aggregated level as paths. The 
motivation for this is the belief that a multi-level description is closer 
to the operator's internal model of the process. In fact, more effective 
communication between our model and the human operator was enabled by the 
use of the higher level description. 
From a given state, the behavior of a component is described in terms
of the physical variables present at its ports. A physical variable (and
its time derivative) may take several values. The time derivative usually
has only one of three possible values: negative, zero, or positive. The
variable itself may take either nominal or ordinal values. The nominal
values usually correspond to points at which behavior (component or
material) changes. For example, water temperature would have nominal values
at freezing and boiling. Variables may also take on ordinal values (or
relationships). For example, water temperature could be taken to be greater
than freezing and less than boiling.
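
The water-temperature example can be sketched in code. Note the assumption: a purely qualitative model has no numbers at all, so the numeric inputs here exist only to show how a reading maps onto a nominal value or an ordinal relationship. Function names and the tolerance are hypothetical.

```python
def sign(x, tol=1e-9):
    """Qualitative time derivative: 'negative', 'zero', or 'positive'."""
    if x > tol:
        return "positive"
    if x < -tol:
        return "negative"
    return "zero"

def qualitative_value(x, landmarks):
    """Locate x relative to named landmark values, given in ascending order.

    Returns a landmark name if x sits exactly on it (a nominal value),
    otherwise the (lower, upper) pair of landmark names that bracket x
    (an ordinal relationship); None marks an open end.
    """
    lower = None
    for name, value in landmarks:
        if x == value:
            return name
        if x < value:
            return (lower, name)
        lower = name
    return (lower, None)

WATER = [("freezing", 0.0), ("boiling", 100.0)]  # degrees Celsius at 1 atm
```

Here `qualitative_value(20.0, WATER)` gives `("freezing", "boiling")`, the ordinal statement "greater than freezing and less than boiling", while `qualitative_value(100.0, WATER)` gives the nominal value `"boiling"`.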


The nominal and ordinal values taken by physical variables are said to
occur in a quantity space [Forbus 1984, Kuipers 1984]. The quantity space
is a partial ordering on the physical variable values it contains. The partial
ordering occurs because not all comparisons are relevant to understanding
the physical system qualitatively. For example, consider a valve
between two tanks, A and B. When the valve is opened, the resulting
behavior is determined by the pressures in the two tanks. The pressure at other
unconnected points in the system is unrelated to the above behavior.
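
A partial ordering of this kind can be sketched as a directed graph of known "less than" facts, with comparisons answered by reachability. This is an illustrative data structure under assumed names, not the representation used in the aid; quantities with no chain of facts between them are simply reported as unordered.

```python
class QuantitySpace:
    """Partial order over named quantities, built from 'a < b' facts."""

    def __init__(self):
        self._greater = {}  # name -> set of names known to be directly greater

    def add_less_than(self, a, b):
        self._greater.setdefault(a, set()).add(b)
        self._greater.setdefault(b, set())

    def _reachable(self, start):
        # Everything known (directly or transitively) to be greater than start.
        seen, stack = set(), [start]
        while stack:
            for nxt in self._greater.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    def compare(self, a, b):
        """Return 'less-than', 'greater-than', 'equal-to', or None if unordered."""
        if a == b:
            return "equal-to"
        if b in self._reachable(a):
            return "less-than"
        if a in self._reachable(b):
            return "greater-than"
        return None
```

If only the fact that tank B's pressure is below tank A's has been recorded, comparing either tank with an unconnected point returns `None`: the ordering is deliberately left unspecified because it has no bearing on the valve's behavior.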


AIDING WITH A QUALITATIVE MODEL 

This section describes how a qualitative model is used as a foundation 
for aiding. First, each window of the interface will be described. Four 
different aiding strategies and the motivation for each of them will then be 
presented. Each strategy emphasizes a different type of aiding information.
ORS Interface

The interface has four windows: schematic, interaction, sensor display, 
and hypotheses (Figure 3). The schematic window displays a schematic 
diagram of the ORS. The schematic always shows the commanded state of the 
valves. The interaction window is where the operator's commands are echoed 
by the interface. The commands available to the operator include the
following:

(1) Opening and closing valves. 

(2) Comparing two pressures. On a real physical system, the numerical
pressure could be displayed on the schematic. When a qualitative
model is used to simulate the physical system, there is no absolute
scale in general to which a pressure can be referred. Instead, a
pressure can be compared to other pressures in the system by the
relations less-than, equal-to, or greater-than.

(3) Display of the first derivative of a pressure (positive, zero, or
negative).

And, when the corresponding aiding feature (described more fully in a
later section) is available,

(4) Turning the what-if model on and off. 

(5) Making state assumptions in the what-if model. 

The sensor display contains the output from the sensor display commands:
the relationship between two pressures or the first derivative of a
pressure. When appropriate aiding features are activated, suggested sensor
readings will also be displayed in this window.
The hypotheses window displays a set of hypotheses that are set by the
operator. These hypotheses are simply state assignments to components (e.g.,
valve 13: leaking). Pipes, which do not have names displayed in the
schematic, are designated as left or right of named components such as
valves and orifices. For example, the pipe between valve 8 and orifice 4 is
designated with respect to either valve 8 or orifice 4.

Aiding Approaches 

Based on observed human strategies of diagnosis, four aiding approaches
seemed to deserve evaluation. Each approach emphasizes different information
and uses an appropriate communication mode for the kind of information.

Topographic Aiding. The first and second aiding approaches are based
on two presumed forms of operator cognitive processing. First, the operator
must observe and infer what the system is actually doing. This processing
is termed O (Observed) and is concerned with flows, leaks through valves,
leaks out of pipes, and the general vicinity of the fault. Second, the
operator needs to generate normal system behavior to compare with observed
behavior. This processing is termed N (Normal). Two obvious forms of aiding
are to generate O and N so that the operator does not have to devote cognitive
processing to generating them. To produce O, the aid integrates the
information from the pressure sensors to which it has continuous access.
Like a human operator, the aid has to guess the actual behavior from the
sensor information since it does not know the real system state. In contrast,
N is generated by the qualitative model under the assumption that
every component is in the normal behavior mode.

O and N are displayed topographically. For both O and N, the aid
displays two forms of system behavior: equal pressure paths and mass flow 
paths. The former is the set of components that should be at equal pressure 
given the commanded valve positions. Whenever the operator creates an equal 
pressure path by opening a valve, the path is highlighted. Similarly, a mass 
flow path created by an operation is highlighted as long as it exists. 

Figure 4 is an example of the N display. Opening valve 9 was the latest
change. If the system were fault-free, this would make the pressure
equal throughout the highlighted path.

Figure 5 shows the same configuration as Figure 4, except that the O
display (rather than N) is activated. Suppose that when valve 9 was opened,
the pressure P2 began to decrease and P1 to increase. This leads the aid to
believe there is a mass flow from tank 2 to tank 1 (the path is highlighted)
in spite of the closed positions of valve 8 and valve 15. However, since the
aid cannot be certain which valve is leaking, it highlights both paths. When
a precise conjecture is not possible, the aid will take a conservative
position as in this example. Note that O and N aiding cannot be used
simultaneously.

Differencing Observed and Normal Behavior. The third aiding approach
is to suggest observations that reveal the differences between the observed
system behavior and the normal system behavior. This difference will be
referred to as O-N. The importance of O-N in ORS diagnosis was discussed in
connection with the results of our preliminary experiment [Yoon and Hammer
1987]. Such a deviation from normal behavior, when observed and correctly
interpreted, helped effectively reduce the size of the feasible hypothesis
set. Figure 6 shows an example of this feature's display in the same
situation as that of Figures 4 and 5. The aid suggests, for example, issuing
a command that inquires the first derivative of P1. When the operator
follows this, he/she will find P1 is increasing, which is opposite to the
commanded situation (no flow should be possible from either GIF or TK2G/L).

The What-if Model. The fourth, and the last, aiding feature is closely
related to the above. This feature can use any hypothetical behavior
(denoted by H), instead of the normal behavior, with which to difference
observed system behavior. The operator can freely set or remove hypotheses.
Then, the aid will run a what-if model based on the hypotheses in place of
the normal model. Any discrepancies (denoted by O-H) will be reported in the
same way. If the hypothesis is incorrect and the observed and hypothesized
behavior differ, the aid will recommend readings that indicate the
difference. If the hypothesis is correct, the aid will produce no
recommendations. For example, suppose valve 8 is leaking to allow a flow
from tank 2 to tank 1. If the operator's hypothesis is a leak in the pipe
between valves 10 and 11, the feature would present the display shown in
Figure 7. If the hypothesis were right, P1 should not increase. In this
example, P1 does increase, so the aid recommends a reading of the first
derivative of P1. Also, the hypothesis does not explain the difference
between P2 and P4. Note that if no hypothesis is stated, the recommendations
would be the same as in the previous example (i.e., O-H = O-N if H = N).
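
The differencing idea can be sketched in a few lines of Python. This is an
illustration under stated assumptions, not the aid's actual algorithm: the
representation of behavior as a mapping from sensor name to derivative
sign, the function name `recommend_readings`, and all sensor values are
hypothetical and are not taken from the figures.

```python
from typing import Dict, List

# Qualitative derivative signs: +1 increasing, 0 steady, -1 decreasing.
Behavior = Dict[str, int]


def recommend_readings(observed: Behavior, model: Behavior) -> List[str]:
    """Return the sensors whose observed trend disagrees with the model."""
    return sorted(
        sensor
        for sensor, trend in observed.items()
        if sensor in model and model[sensor] != trend
    )


# Hypothetical values chosen only to exercise the function.
observed = {"P1": +1, "P2": -1, "P4": 0}
normal = {"P1": 0, "P2": 0, "P4": 0}        # N: every component normal
hypothesis = {"P1": 0, "P2": -1, "P4": 0}   # H: a fault assumed by the operator

o_minus_n = recommend_readings(observed, normal)      # ['P1', 'P2']
o_minus_h = recommend_readings(observed, hypothesis)  # ['P1']
```

When the hypothesized behavior is set equal to the normal behavior, the two
calls return the same list, and a hypothesis that reproduces every observed
trend yields an empty list of recommendations.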

The common motivation for these aiding approaches is to perform compu- 
tations that the operator is believed to do when diagnosing the system. As 
much as these computations are related to the human's mental model, the 
qualitative model in the aid may be an appropriate vehicle to help or 
replace the computations. There are two ways this approach might help.
First, the operator may have an incorrect or incomplete mental model.
Second, the operator may have difficulty integrating correct component
behavior into correct system behavior because of cognitive limitations. The
aiding approaches support different uses of the mental model: to envision
the normal or hypothetical behavior, to conjecture the actual behavior, and
to describe the difference between behaviors of two (e.g., O and H) models.
This does not mean the operator need not understand the system at all; he or
she still needs to understand the meaning of the aid's information and
select the hypotheses.

THE EXPERIMENTS 


Overview of Experimental Design

To evaluate the types of aiding information, three separate experiments 
were conducted. The first experiment tested the effects of N information. 
The next experiment compared the effects of O and O-N against unaided
diagnosis. The last experiment focused on hypothesis testing and evaluated
the aiding effects of O-N and O-H.

The display of aiding information prevented those features from being
tested together. A subject must not be exposed to both N and O features
since severe interference, perhaps in the form of a carry-over effect, was
expected. This is because the display of O and N information is identical
but each carries a different meaning. For the same reason, O-H and O-N
should not be used together. When O-H is used, it acts as O-N until the
subject expresses one or more hypotheses. This makes a direct comparison
between O-N and O-H difficult. Even if O-H really improves the performance,
its contribution will depend on the extent to which a subject uses it.
Therefore, a different experimental setting needs to be employed to evaluate
the potential benefit of O-H. The above considerations led to the three
separate experiments.

In all three experiments, replicated Latin square designs were employed 
[Edwards 1972]. Differences in the complexity of problems and differences 
between users were expected to introduce large variation in the performance. 
It was therefore desirable, in order to enhance the efficiency of the
experiments, to select problem and subject as two blocking variables. Such
designs are called within-subjects designs because each subject serves in
more than one treatment level.

A Latin square design, if its assumptions hold, should be more economi- 
cal than a corresponding complete block design. Even without considering 
economy, our experiment does not allow a complete block design. Because a
subject should not be given the same problem more than once, he/she can be
assigned only one level of treatment for each problem.

In a Latin square design, the positions of each treatment level are 
counterbalanced: namely, each treatment occurs at each test position with 
equal frequency. This prevents possible practice effects from being con- 
founded with treatment effects. Instead, practice effects are then con- 
founded with test positions (i.e., problem). However, the problem factor is 
merely a blocking variable and we were not interested in the significance of 
its effects. Also, the training was designed to stabilize the subject's 
performance and thus minimize learning effects. 

One possible problem with a within-subject design is that the value of
an observation for one treatment may be influenced by the effects of
treatments applied during earlier periods. When this arises, the treatment
is referred to as having carry-over effects. The influence of this effect,
if any, may be partially compensated for by adopting a balanced Latin square
design, in which each treatment follows every other treatment the same
number of times. When the number of treatments is odd, then at least two
Latin squares are required to achieve this. This replication also permits a
larger number of data points. All our experiments were designed following
the above principles. The resulting designs are presented in Figures 8, 9
and 10.
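
The balancing property can be checked with a short Python sketch. It uses
the standard Williams construction, which is an assumption here: the report
does not say how the squares in Figures 8, 9 and 10 were generated. Each
row is read as the order of treatments met by one subject (or group), and
treatments are numbered from 0.

```python
from collections import Counter
from typing import List


def first_row(n: int) -> List[int]:
    """Williams ordering 0, 1, n-1, 2, n-2, ... for n treatments."""
    row, low, high, take_low = [0], 1, n - 1, True
    while low <= high:
        if take_low:
            row.append(low)
            low += 1
        else:
            row.append(high)
            high -= 1
        take_low = not take_low
    return row


def balanced_latin_squares(n: int) -> List[List[List[int]]]:
    """One square for even n; a square plus its mirror image for odd n."""
    base = first_row(n)
    square = [[(t + shift) % n for t in base] for shift in range(n)]
    if n % 2 == 0:
        return [square]
    return [square, [row[::-1] for row in square]]


def carry_over_counts(squares: List[List[List[int]]]) -> Counter:
    """Count how often treatment b immediately follows treatment a."""
    return Counter(
        (row[i], row[i + 1])
        for square in squares
        for row in square
        for i in range(len(row) - 1)
    )
```

With three treatments the construction returns two squares and every
ordered pair of distinct treatments occurs twice; with four treatments a
single square suffices and every ordered pair occurs once.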

While the balanced Latin square designs may compensate for the above 
problems, they are based on several assumptions. A key question concerning 
the Latin square design model is whether the effects of blocking variables 
and treatments are additive: since there is only one observation per cell, a 
Latin square design model assumes additivity to estimate the error variance. 
If nonadditivity is present in the data, the use of a model assuming
additivity will lower both the significance level and the power of the test
for treatment effects. Thus, the Tukey test for additivity was conducted
whenever we applied a model to the data [Neter and Wasserman 1974, pp.780].
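
As a rough illustration of what such a test computes, the sketch below
implements Tukey's one-degree-of-freedom test for a two-way table with one
observation per cell. This simpler two-factor form is an assumption made
for brevity; the Latin square version applied in the report follows the
cited text and is not reproduced here. The function name is hypothetical.

```python
from typing import List, Tuple


def tukey_nonadditivity(table: List[List[float]]) -> Tuple[float, float, int]:
    """Return (SS_nonadditivity, SS_remainder, df_remainder) for a two-way table."""
    rows, cols = len(table), len(table[0])
    grand = sum(sum(r) for r in table) / (rows * cols)
    row_eff = [sum(r) / cols - grand for r in table]
    col_eff = [
        sum(table[i][j] for i in range(rows)) / rows - grand for j in range(cols)
    ]

    # Residual sum of squares of the purely additive model.
    ss_error = sum(
        (table[i][j] - grand - row_eff[i] - col_eff[j]) ** 2
        for i in range(rows)
        for j in range(cols)
    )
    # Single-degree-of-freedom component for a row-by-column product term.
    numerator = sum(
        table[i][j] * row_eff[i] * col_eff[j]
        for i in range(rows)
        for j in range(cols)
    ) ** 2
    denominator = sum(a * a for a in row_eff) * sum(b * b for b in col_eff)
    ss_nonadd = numerator / denominator
    return ss_nonadd, ss_error - ss_nonadd, (rows - 1) * (cols - 1) - 1
```

The test statistic is `ss_nonadd / (ss_remainder / df_remainder)`, referred
to an F distribution with 1 and `df_remainder` degrees of freedom. For a
purely multiplicative table the remainder is essentially zero, so the whole
residual is attributed to nonadditivity; the sketch assumes that neither
the row effects nor the column effects are all zero.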

While homogeneity and normality of error variance are the basic assump- 
tions in an ANOVA model, it is known that the F test is not much affected by 
deviation from these conditions [Lee 1975, pp.284]. However, a residual 
plot of error terms against expected cell means can reveal the need for 
transformation of dependent variables. Since a transformation would affect 
the interpretation of treatment effects, residual plots were examined in
every analysis.

Experiment 1.

The purpose of this experiment was to compare N aided and unaided
diagnosis. It is reasonable to expect diagnostic performance to be improved
when the envisionment of normal system behavior is improved. In our
pre-experimental observations, however, we observed that most subjects found
this aiding confusing or irrelevant. Since its effectiveness was doubtful
based on this observation, it was evaluated first.

Six industrial engineering students volunteered to serve as subjects.
They were trained through two sessions (total 3.5 to 4.0 hours) to acquire
enough knowledge of fluid dynamics and elements of diagnostic procedure. The
goal of our training was to teach the subjects correct causal reasoning
about the ORS and give them reasonably stabilized diagnostic skills.
However, if a subject is exposed to a kind of problem several times in a
short period, the subject may develop some mechanistic diagnosis procedures
that do not require causal reasoning. When a similar problem is given, the
subjects may try to deal with it as a routine failure rather than a novel
one. We felt that a longer training may increase this possibility since the
complexity of our version of ORS is only moderate.

Training session 1 started with basic principles derived from fluid
dynamics. Then, possible malfunctions for each component were discussed.
Finally, the subjects undertook a simulated ORS mission, during which
envisioning of normal system response was practiced. Session 2 taught
elementary diagnostic procedures such as checking a sensor bias. The subject
then was required to plan testing procedures for five typical hypotheses.
Each developed procedure was discussed by the experimenter until the subject
developed (and understood) a correct procedure. Next, three real problems
were given as exercises. Sessions 1 and 2 took 1.5 hours and 2 hours on the
average, respectively.

The performance of the subject in the entire training sessions was
closely monitored. The first session contained many questions to ascertain
if the subject achieved proper understanding. The answers were checked
during the session and, whenever necessary, discussed again. Problem
solving exercises were also attended by the experimenter and necessary
discussion or re-explanation was provided. The result was that the initially
poorer subjects would spend more time in training rather than end with poor
understanding. By the end of the second session, all the subjects performed
satisfactorily and showed little additional improvement in diagnostic skill.

The considerations which led to the design of the experiment have been
discussed in the overview section. The design for Experiment 1 is shown in
Figure 8. Each group was composed of three subjects and the Latin square
was replicated three times using different problems.

Many different performance measures were tried with our data from the
pilot experiment. The number of information gathering actions (#IGA) and
the time to solve (Time) appeared to be appropriate performance measures.
Although several other measures were examined with the data, they either
turned out to have insufficient resolution or showed high correlations with
the above measures. Thus, the above were the most important measures in this
experiment. Time and #IGA showed virtually identical behavior both in the
examination of aptness of the ANOVA model and tests of significance.

The data collected from 36 subject-problems were first analyzed to
determine if there were significant interactions between problems and aiding
levels. The interactions were found insignificant both in Time (p = .609)
and #IGA (p = .534). This suggested that the interaction term can be
excluded from the model and its sum of squares may be pooled with that of
the error term.

The Tukey test uncovered nonadditivity in the data of both Time and
#IGA. The residual plot indicated that the cell standard deviations were
proportional to cell averages. As this is frequently the case when the
criterion is response time [Lee 1975, pp.291], a logarithmic transformation
was suggested. After the transformation, the anomaly in the residual plot
was fixed. The transformed data, both in Time and #IGA, appeared to adhere
to the homogeneity and normality requirements for ANOVA better than the
original scores. The interactions between aiding levels and problems were
still insignificant. The Tukey test was performed again with the new scores
and showed no significant nonadditivity.

The contribution of N aiding to both Time and #IGA was on the negative
side, though not significant (p = .096 and .381, respectively). On the
average, it corresponds to a 31% increase in Time and 13% in #IGA.
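
Because the analysis was run on log-transformed scores, an effect on that
scale reads naturally as a percentage change in the original units. The
sketch below shows one way such a percentage can be obtained. It is an
assumption that the reported figures were derived this way, since the
report does not describe the calculation, and the function name and sample
values are hypothetical.

```python
import math
from typing import Sequence


def percent_change_from_logs(aided: Sequence[float], unaided: Sequence[float]) -> float:
    """Percent change implied by the difference of mean log scores."""
    mean_log_aided = sum(math.log(x) for x in aided) / len(aided)
    mean_log_unaided = sum(math.log(x) for x in unaided) / len(unaided)
    # A difference d on the log scale is a ratio exp(d) on the original scale.
    return (math.exp(mean_log_aided - mean_log_unaided) - 1.0) * 100.0


# Invented solution times in which every aided time is 1.31 times the unaided one.
unaided_times = [100.0, 200.0, 400.0]
aided_times = [131.0, 262.0, 524.0]
change = percent_change_from_logs(aided_times, unaided_times)
```

The result is the ratio of geometric means minus one, so it is insensitive
to the spread of problem difficulty in a way that a difference of raw means
would not be.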

These results should not simply be interpreted to mean that the N feature
did not help the envisionment of normal system behavior or that the role of
such envisionment in the diagnosis is unimportant. A proper interpretation
may be that the normal envisionment could not be helped very well by
providing external information because the process is too quick and deeply
embedded in a larger cycle of human information processing. Another
possibility is that envisioning normal system behavior was not a bottleneck
in diagnostic performance.

We concluded the former interpretation was very likely considering the
following. First, most subjects, after their main sessions, stated that the
aid was not only uninformative, but also somewhat distracting or confusing.
A subject said he wished he could get 'real' system behavior rather than
'normal' behavior. Second, the fairly strong negative aiding effects could
not be explained if the aid helped only unimportant subtasks. Third, the
negative aiding effect was notably stronger in Time than in #IGA. (This was
the only occasion in which the two measures showed any notable difference in
the analysis throughout Experiments 1 and 2.) This implies that the aid
forced the subjects to think for a longer time but did not greatly affect
their diagnostic procedure. This result supported the subjects' reports
that the aid was confusing and distracting. Thus, we concluded that there
was interference between N information and the operators' diagnostic
information processing. Certainly, they do predict normal system behavior as
a subtask: it is obviously necessary. But when they sought information from
the display, it was not information about normal system behavior. This
observation will be incorporated into the modeling of deep-reasoning
diagnosis later in this paper.

Experiment 2. 

The second experiment was to assess the aiding effects of O and O-N
features against unaided diagnosis. Nine new subjects, again industrial
engineering students, were recruited as volunteers. Two training sessions,
virtually the same as in the first experiment, were given. In terms
of content, the only difference was that the explanation of the new features
replaced that of the N feature. The design of the experiment, shown in
Figure 9, was also the same except for a different number of treatment
levels and replications.

The procedure of statistical analysis was the same as in Experiment 1.
First, the interactions between aiding levels and problems were found
insignificant. After pooling the sum of squares for interactions into error
sum of squares, the Tukey test for additivity was performed. No significant
nonadditivity either in Time or #IGA was found. When the residual plots
were examined, however, it was indicated that both measures needed to be 
logarithmically transformed. After the transformation, the new residual 
plots showed stabilized error variance. Again, the interactions between aid- 
ing levels and problems were insignificant. The Tukey test with the new 
scores yielded a much lower F value than before the transformation, confirm- 
ing that the new scores fit the assumptions of the model better. 

As results of the analysis of variance, both Time (p = .0302) and #IGA
(p = .0005) showed significant effects of aiding. In Time, the improvement
(i.e., decrease in Time) on the average was 34% by O aiding and 42% by O-N
aiding. In #IGA, O aiding permitted a 40% decrease while O-N aiding gave
44%. Newman-Keuls tests were performed to determine if there were
significant differences between pairs of aiding levels. Both O and O-N
aiding levels had significantly different means when compared to the unaided
mean. This result was identical for both Time and #IGA. In either measure,
there was no conspicuous difference between O and O-N aiding.

The obvious conclusion is that both aiding features were effective in 
both measures and permitted solid enhancement of human diagnostic perfor- 
mance. In contrast to the N feature, these types of information appeared to 
be well accepted by the human process of diagnosis and helped the human in 
some important elements for his/her performance. 

Experiment 3.


The motivation for Experiment 3 was informal observation of subjects 
during Experiment 2. The effectiveness of O-N aiding in Experiment 2
appeared to decrease as the diagnosis proceeded. As is to be supported by 
more elaborate analysis later, this motivated us to investigate possible 
transitions between problem solving phases made by the diagnostician. Prob- 
ably the most notable change in diagnosis as time passes was that the diag- 
nostician began to deal with more explicit and individual hypotheses after 
the feasible hypothesis set size had been sufficiently reduced. In later 
phases with individual hypotheses, the characteristics of problem solving 
may be very different than the earlier phase of narrowing down the 
hypothesis set. Therefore, it was necessary to investigate the nature of 
diagnostic activity and proper form of aiding with such explicit hypotheses. 

Due to its unique purpose, this experiment had an important difference 
in its setting from the first two experiments. In Experiments 1 and 2, the
subjects solved whole diagnosis problems starting with primary symptoms. In 
the third experiment, the subjects determined whether a given hypothesis was 
true. At first, instead of being told of symptoms, the subject was allowed 
to perform some predetermined sensor readings which would indicate abnormal 
system behavior. Then, the subject was given a hypothesis to evaluate. 
Without needing to diagnose the real failure, the subject was to end the
problem solving by merely saying whether he/she agreed with the hypothesis.

The effects of O-N aiding and O-H aiding were evaluated against unaided
situations in two separate Latin square designs, i.e., Experiments 3-a and
3-b. They are shown in Figure 10. This was because, as mentioned earlier, it
was not possible to assign both O-N and O-H aiding levels to the same
subject due to expected interference. Although both Time and #IGA were
collected, only Time was used in formal statistical analysis. Since the
problems are much smaller in size than those of earlier experiments, #IGA is
usually a small integer that would not easily lend itself to meaningful
statistical analysis considering the vast difference in the subjects'
diagnostic procedures. Otherwise, the analysis followed a procedure similar
to that of the previous experiments.

In the analysis of the data from Experiment 3-a, the main question was
what effects O-N aiding would have on the performance of diagnosis with a
given hypothesis. First, the interactions between aiding levels and problems
were tested and found insignificant (p = .881). Thus, a pooled error sum of
squares was used for subsequent analysis. The Tukey test for additivity
revealed the data were indeed additive. The residual plot also confirmed
the model fitted the data quite well. It may be noted that, unlike the
former experiments, no transformation was found necessary. The reason
perhaps lies in the nature of the problems; these problems are just
elementary subtasks which the operator should do numerous times in a whole
diagnosis. As for the whole diagnosis time, the standard deviations were
proportional to the means. That is, when a problem was more complex, the
variation in the actual diagnosis time tended to be larger. This tendency
most probably comes from the process of narrowing down the hypothesis set
since the subtask of hypothesis testing did not show this property.

The performance was somewhat worse with O-N aiding than without it.
Although not significant (p = .192), the difference on the average extended
to 15.6 seconds (overall average was 67.4 seconds). The interpretation will
be discussed with the evaluation of O-H aiding.

Experiment 3-b proceeded the same way except O-H aiding was tried in
the place of O-N aiding. The interactions between aiding levels and problems
were negligible (p = .8593). The additivity test confirmed that the data
were additive. As in Experiment 3-a, the residual plot indicated that no
transformation was needed. Surprisingly, the effects of aiding appeared to
be completely negligible (around 1 second, p = .9546).

The interpretation of these results is subtle. First, the O-N information
was not relevant to the operator's activity of testing a given hypothesis.
The aid only distracted the operator into thinking about irrelevant
information. This confirmed our earlier observation in Experiment 2 that the
aiding effects of O-N information seemed to diminish as the diagnosis
proceeds into its final stage. This observation, too, became a basis of our
modeling of deep-reasoning diagnosis, which is discussed in a later section.

Then, why was O-H aiding, which must be relevant to the given
hypothesis, not effective? Two possibilities occur. First, the O-H
information was simply not relevant to the problem solving. Second, the
information was relevant but trivial to the subjects. The first
interpretation is not consistent with our previous results that, when
irrelevant information was given to the subjects, the performance showed
signs of degradation. The remaining choice is that the information, which is
basically a set of suggestions for interesting observations, was already
known to the subjects. That is, they already knew what to look at even
without the aiding; the aid only confirmed it.

This interpretation could be further confirmed by a detailed process
analysis. In Experiments 3-a and 3-b, 32 problems were solved without
aiding. If O-H aiding had been provided with these problems, it would have
suggested useful sensor reading actions 39 times. In 38 out of the 39 times
(97.4%), the subjects collected equivalent information without it being
suggested. Since they were ready to gather the O-H information whenever it
was useful, the suggestions for this information by the computer were not
able to improve the performance further. Because, unlike the O-N
suggestions, O-H suggestions were just what the subjects were about to do,
they were understood as trivial so that no performance decrement was caused
by interference, either.

There was also an indication that the subjects planned valve operations
and sensor readings together ahead of the actual operations. The subjects'
collecting of O-H information was remarkably precise. There were 5
occasions in which the O-H aiding, if it had been given, would have
suggested uninformative readings. Failing in only one case out of 39 to look
at useful O-H information, the subjects did not waste their time on the
uninformative sensor readings in any of the 5 occasions. Such precision may
not be possible if the subjects were simply hunting around for useful
observations by chance in scenes they had just created. Most likely, the
scenes were purposely planned, aiming at the useful information. It should
be noted that this tendency was unique and appeared only when an explicit
hypothesis was given.

Summary 

To summarize, O aiding and O-N aiding improved the diagnosis while N
aiding did not. Actually, N aiding seemed to have negative effects. This
suggests that the operator can effectively utilize O information, not N
information, supplied from outside of his/her own information processing.
The usefulness of O-N aiding seemed to decrease over time, perhaps as
explicit hypotheses arose. In explicit hypothesis testing, O-N aiding showed
a weak negative contribution while O-H aiding did not affect the performance
at all. When weak negative effects were found, there seemed to be some
interference caused by irrelevant information. On the other hand, O-H aiding
was trivial and innocuous. The precision with which the subjects collected
O-H information indicated that, when a hypothesis was given, the operational
actions and data collection were usually planned together before the
operations. This is an important observation on how the operators used their
mental models.

A MODEL OF DEEP-REASONING DIAGNOSIS 


Methodology 

In this section, the experimental results will be integrated into a 
model of novel fault diagnosis. 

The overall diagnostic procedure can be viewed as a combination of two 
elements: information processing tasks and a control strategy. Information
processing tasks are subprocedures of diagnosis which can be characterized
by their input, output and processes which take the input to produce the 
output. The control strategy is the way in which information processing 
tasks are selected. 

The emphasis in this research has been on the information processing
tasks, not the control strategy. There are several reasons for this. First,
aiding novel fault diagnosis is the goal. Such diagnosis relies on causal
reasoning about the system. To help causal reasoning, information processing
tasks in which causal reasoning is embedded need to be understood. Second,
we wanted to evaluate an aid which would be able to help the human to
overcome cognitive limitations to some extent. While the aid would possess a
similar causal reasoning capability to a human, it would not suffer the same
cognitive limitations. This aid would be a more direct help to information
processing tasks rather than the control strategy. Third, the findings from
our research would permit insights into the structure of these information
processes since our aiding approach was to provide various types of
information which would substitute for the operator's information
processing.

The emphasis on information processing led to a description of data 
flows rather than a flow chart. A flow chart would depict how the chronolog- 
ical sequence of various processes is controlled. In contrast, a data flow 
diagram would describe the necessary information input to a process, the 
expected output from a process, and the organization of processes through 
the links of information. This diagram helps to identify necessary sub- 
processes and alternative ways of automation. 

A basic assumption connects our aiding experiments and the human infor- 
mation processing model: the human can better incorporate external informa- 
tion into his/her processing when the information becomes an alternative 
input to one of the higher level processes. An information processing task 
can be broken into processes, each of which can be broken into subprocesses.
We assume that aiding information can be substituted for an entire process 
more effectively than for just an individual subprocess. There are several 
reasons to believe this assumption is reasonable. Because they are inner 
cycles in processes, subprocesses iterate and require input at higher rates. 
Also, the operator's working memory is more heavily loaded during a
subprocess since the status of the higher level process, as well as that of
the subprocess itself, should be retained. With the frequent cycles and
heavy mental workload, it would be harder to perceive and apprehend
externally supplied information [Wickens 1984, Rasmussen 1984].

As far as causal reasoning of the system operation is concerned, two
directions of information processing should exist: observations to
hypotheses and hypotheses to observations. The former task takes
observations as input and produces hypotheses, while the latter starts from
hypotheses and identifies necessary observations. Both tasks may be
categorized as Search by Evaluation according to Rasmussen's classification
[Rasmussen 1984].

Observations to Hypotheses.

This task is triggered by observations of system behavior and will be
referred to as data-driven search. It occurs when the observations were
collected without particular hypotheses or showed unexpected patterns that
fell outside hypotheses of interest. It seemed therefore natural that the
subjects performed this type of process more often in earlier phases of
diagnosis. Since O-N aiding was useful in earlier phases, the information
it supplied must be closely related to this task. The poor performance of N
aiding, however, indicates that the human's use of N information is in a
lower level subprocess, very likely to produce O-N information. Therefore,
it is suggested that there is a process which filters the observations to
pass only more interesting (i.e., unexpected) ones to the next process; N
information is used for one of its subprocesses. Obviously, there must be
one more process to complete this task. In this second process, the human
tries to come up with a set of plausible hypotheses that explain the
observations. Some of the interesting observations may be remembered to
evaluate future hypotheses throughout the diagnosis. The above constraints
allow one to conceive a model of the data-driven search as represented in
Figure 11.

Two processes were identified. The first process is filtering
observations. Only the observations which passed this filtering are used in
the following process of entertaining hypotheses. The filtering process
contains a reference mental model of the system. The reference model is a
mental model that produces standard behavior against which observed system
behavior is continuously compared and judged as expected or unexpected. At
first, the reference model behavior is that of the normal system. As more
observations are accumulated, however, some abnormal system behavior would 
also become expected even though the reason may not be understood. An 
expected observation does not carry additional information and should be 
filtered out as trivial. Thus, the reference model should evolve incorporat- 
ing more and more observations of actual system behavior. Converging to the 
actual system in its behavior, the reference model would lower the probabil- 
ity of unexpected observations. Consequently, the efficiency of unplanned 
observations would decrease and the data-driven search would become less 
useful as the diagnosis proceeds. 
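
The filtering process described above can be illustrated with a minimal 
sketch. It is not the authors' model: the class and function names, the 
sensor labels, and the qualitative readings are all hypothetical, and the 
reference model is reduced to a table of expected readings that starts as 
"normal" and absorbs every reading it has seen. 

```python
from dataclasses import dataclass, field


@dataclass
class ReferenceModel:
    """Hypothetical stand-in for the operator's reference mental model."""

    # Expected qualitative reading per sensor, e.g. {"P1": "low"}.
    expected: dict = field(default_factory=dict)

    def is_expected(self, sensor, reading):
        # Assumption: a sensor not yet observed is expected to read "normal".
        return self.expected.get(sensor, "normal") == reading

    def absorb(self, sensor, reading):
        # Once seen, a reading becomes expected, even if it is unexplained.
        self.expected[sensor] = reading


def filter_observations(model, observations):
    """Return only the unexpected observations, then update the model."""
    passed = []
    for sensor, reading in observations:
        if not model.is_expected(sensor, reading):
            passed.append((sensor, reading))
        model.absorb(sensor, reading)
    return passed


model = ReferenceModel()
readings = [("P1", "low"), ("P2", "normal")]
print(filter_observations(model, readings))  # [('P1', 'low')]
print(filter_observations(model, readings))  # []
```

The second call passes nothing because the abnormal reading has already 
been absorbed, which mirrors the declining yield of unplanned observations 
as the diagnosis proceeds. 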

In earlier phases of diagnosis, when the reference model behavior is 
normal, O-N aiding replaces the whole filtering process and provides input 
information to the hypothesis-entertaining process. According to our basic 
principle, it should be easier for the human to incorporate such information 
into his/her information processing. This was supported by the experimental 
result that O-N aiding improved the diagnostic performance. However, the 
gradual departure of the reference model from normal system behavior would 
degrade the relevance of O-N aiding in the filtering. This was supported by 
the observation that O-N aiding was mostly useful in earlier phases of 
problem solving. 

O aiding enhanced the observations which are input to the filtering 
process. The enhancement is in fact presentation of observed system behavior 
at a higher level of abstraction than the sensor displays [Rasmussen 1984]. 
For example, while the operator would normally look at individual pressure 
points to check the system behavior, O aiding would display a mass flow, 
which is not the behavior of a component, but of a path. Since this level, 
being more functional, allowed more appropriate information coding for the 
operator's use, it should improve the filtering process. The experimental 
results supported this. 

The prediction of normal system behavior (N aiding) is at first 
equivalent to the subprocess of running the reference model. This activity 
is internal to the filtering process, neither replacing a process nor 
providing better information to a process. As a result, there may be little 
chance to improve human diagnosis by providing this information from 
outside. Indeed, the experiment showed that N aiding had rather negative 
effects, though not significant, perhaps due to distraction. 
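
To make the relation among the three kinds of information concrete, the 
following sketch derives a deviation from normal behavior by comparing an 
observed response (O) with a predicted normal response (N). The three-level 
qualitative scale, the sensor names, and the function name are assumptions 
made for illustration; the report's own qualitative model is not reproduced 
here. 

```python
# Qualitative levels ordered from low to high (an illustrative assumption).
LEVELS = {"low": -1, "normal": 0, "high": 1}


def deviation(observed, predicted):
    """Return the sensors whose observed level differs from the prediction.

    Each entry maps a sensor to "+" (observed above the prediction) or
    "-" (observed below the prediction).
    """
    result = {}
    for sensor, obs in observed.items():
        diff = LEVELS[obs] - LEVELS[predicted[sensor]]
        if diff != 0:
            result[sensor] = "+" if diff > 0 else "-"
    return result


observed = {"P1": "low", "P2": "normal", "F1": "high"}     # O
normal = {"P1": "normal", "P2": "normal", "F1": "normal"}  # N
print(deviation(observed, normal))  # {'P1': '-', 'F1': '+'}
```

In this sketch N is only an intermediate input to the comparison, while 
the result is what reaches the next process. Passing a hypothesized 
response in place of the normal one would give the corresponding deviation 
from hypothesized behavior. 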

Hypotheses to Observations 

When given hypotheses are to be evaluated, the operator would build a 
testing plan that may prove one hypothesis and disprove the rest. This task 
is called hypothesis-driven search. Experiment 3-a indicated, by 
demonstrating poor performance of O-N aiding, that this task was very 
different from data-driven search in its information processing. 

This type of process tends to be employed more often toward the final 
stage of diagnosis as the data-driven search loses its efficiency. An 
important restriction of this process is that the hypothesis should be 
sufficiently explicit for the diagnostician to perform mental simulation 
based on it. There are usually too many explicit hypotheses that are 
feasible in earlier phases of diagnosis. Therefore, the data-driven search 
may be preferred in narrowing down the feasible hypothesis set. Toward the 
end of diagnosis, however, the number of feasible hypotheses would become 
smaller and the need for testing the remaining hypotheses individually would 
increase. Then, the hypothesis-driven search dominates the diagnosis. 

In Experiment 3-b, we forced the subject to perform this process by 
assigning a hypothesis to test. The experimental result that O-H aiding did 
not improve the human diagnosis can be explained in this model. O-H aiding 
suggested sensor readings which would show the difference between actual and 
hypothetical system behavior. When the hypothesis is false, a proper test 
would reveal the existence of O-H behavior to disprove the hypothesis. 
Thus, O-H information is certainly relevant to the hypothesis testing. It 
is reasonable to expect O-H aiding to be helpful if the operator collects 
observations and filters them as in the data-driven search. If, however, 
the tests are planned by predicting observable differences (as in Figure 12) 
depending on whether the hypothesis is true, O-H information is identified 
before the actual testing operation. In this case, externally suggested O-H 
information would only be redundant and would not improve the performance. 

The latter case was supported by the experiment: the aid gave no 
performance improvement, and the operators collected O-H information in an 
extremely efficient manner even in unaided diagnoses, in which they were not 
given the suggestions by the aid. Therefore, it is safe to conclude that 
the operator, when a hypothesis is given, runs his/her mental model to 
determine a test that would distinguish the given hypothesis from other 
hypotheses. Figure 12 describes the model of this task. 
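
The test-planning step can be sketched as follows. This is an illustration 
under stated assumptions, not the model of Figure 12: the fault names, the 
sensor names, and the table of predicted readings are hypothetical, and 
"running the mental model" is reduced to a lookup of what each hypothesis 
predicts at each sensor. 

```python
def plan_test(given, rivals, predictions):
    """Pick the sensor that separates `given` from the most rivals.

    `predictions[h][sensor]` is the reading that hypothesis `h` predicts.
    Returns (sensor, rivals_ruled_out), or None if no sensor discriminates.
    """
    best = None
    for sensor, expected in predictions[given].items():
        # Rivals that predict a different reading at this sensor.
        separated = [h for h in rivals if predictions[h][sensor] != expected]
        if separated and (best is None or len(separated) > len(best[1])):
            best = (sensor, separated)
    return best


# Hypothetical predictions for three candidate faults.
predictions = {
    "valve_stuck": {"P1": "low", "P2": "normal", "F1": "low"},
    "pump_degraded": {"P1": "low", "P2": "low", "F1": "low"},
    "sensor_fault": {"P1": "low", "P2": "normal", "F1": "normal"},
}
print(plan_test("valve_stuck", ["pump_degraded", "sensor_fault"], predictions))
# ('P2', ['pump_degraded'])
```

Because the discriminating sensor is found from the predictions alone, 
before any reading is taken, an external suggestion of the same sensor adds 
nothing, which is consistent with the redundancy argument above. 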

Control Strategy 

The control strategy is both highly dynamic and individualistic. 
Operators switch frequently between information processing tasks. The 
selection of tasks depends on the assessment of relative efficiency and 
effectiveness of different tasks in different situations. For example, if 
the diagnostician is equipped with very inexpensive testing methods to check 
every component directly, the cost of hypothesis-driven search will be 
drastically reduced from what it is in the ORS diagnosis. This observation 
suggests the possibility that the control strategy can be changed when aiding 
affects the efficiency of elementary tasks. 

Although the two information processing tasks are the most important 
elements, the strategy may involve other types of information processing. 
Topographic search [Rasmussen 1984] can be used either to entertain 
hypotheses or to identify the necessary observations for a hypothesis. In 
fact, this is believed to be a frequent way in which the operator, when 
performing data-driven search, selected the data to begin with. 

Regarding the control strategy, the only observation we could be 
assured of was that the subjects gradually transitioned from data-driven to 
hypothesis-driven search as the diagnosis proceeded. This was perhaps 
because the reduction in the size of the feasible hypothesis set changed the 
relative efficiency of the two processes. For instance, with only one 
hypothesis to deal with, explicit planning of a test by hypothesis-driven 
search must be more efficient. It may also be partly because, as we have 
already discussed, the data-driven search lost its efficiency as 
observations were accumulated. 
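
The qualitative argument about switching can be made concrete with a toy 
cost comparison. The cost formulas and parameter names below are 
assumptions introduced only for illustration; the report gives no 
quantitative model of the control strategy. 

```python
def choose_search(n_hypotheses, p_unexpected, test_cost=1.0, observe_cost=1.0):
    """Toy rule: pick the task with the lower expected cost.

    Assumptions (not from the report): testing every remaining hypothesis
    costs n_hypotheses * test_cost, and obtaining one unexpected
    observation costs observe_cost / p_unexpected on average.
    """
    hypothesis_driven = n_hypotheses * test_cost
    if p_unexpected <= 0:
        data_driven = float("inf")
    else:
        data_driven = observe_cost / p_unexpected
    if hypothesis_driven <= data_driven:
        return "hypothesis-driven"
    return "data-driven"


# Early diagnosis: many feasible hypotheses, surprises are common.
print(choose_search(20, 0.5))                  # data-driven
# Late diagnosis: one hypothesis left, surprises are rare.
print(choose_search(1, 0.05))                  # hypothesis-driven
# Very inexpensive component tests shift the balance early on.
print(choose_search(20, 0.5, test_cost=0.05))  # hypothesis-driven
```

Under these assumptions, both a shrinking hypothesis set and a falling 
probability of unexpected observations push the choice toward 
hypothesis-driven search, and cheaper tests do the same. 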

In conclusion, the detailed modeling of information processing tasks 
helped to integrate our findings and observations of the human operator's 
novel fault diagnosis. The models of human information processing tasks were 
useful in explaining the aiding effects of various types of information. 
They should also be useful for predicting the effects of aiding to be 
proposed in the future. Such predictions, in turn, may be tested in 
experiments to verify the model. 

CONCLUSION 

An aiding approach has been described and evaluated for novel fault 
diagnosis in complex systems. To the best of our knowledge, this approach 
is unique in the following ways. First, the emphasis is on novel rather 
than routine faults. Second, it contains a qualitative model that may 
correspond to the human's internal model of the system. This model 
represents knowledge only of how the system behaves. Therefore, this aiding 
approach does not rely on proceduralized knowledge. Third, the qualitative 
model is the basis for much of the aiding that takes place. 

The experimental results confirmed that a deep-reasoning diagnosis can 
be aided, without disturbing the human diagnostic procedure, by providing 
relevant information. However, the results also suggested that the aiding 
information should be compatible with the human information processing. 
This emphasizes the importance of understanding the human information 
processing to build an effective aid. A principle of particular importance 
is that the information from/to higher-level processes is better 
incorporated into the human's information processing. The findings and 
observations were integrated into an effort to model the information 
processing tasks for deep-reasoning diagnosis. 

Figure 3. The operator's display. 

The observed response (O). 

Figure 6. Deviation from normal behavior (O-N). 

Figure 7. Deviation from hypothesized behavior (O-H). 

Figure 8. Latin Square Design for N effects in Experiment 1. 

Figure 9. Latin Square Design for O and O-N in Experiment 2. 

Figure 10. Latin Square Designs for O-N and O-H effects in Experiment 3. 